Prosodic Reading Style Simulation for Text-to-Speech Synthesis

Authors

Oliver Jokisch

Hans Kruschke

Rüdiger Hoffmann

Abstract

The simulation of different reading styles (mainly by adapting prosodic parameters) can improve the naturalness of synthetic speech and supports more intelligent human-machine interaction. The article investigates the reading styles News and Tale as examples. For comparison, all examined texts contained the same genre-neutral paragraphs, which were read without a specific style instruction (Normal) but also faster, slower, rather monotonously, or more emotionally, yielding corresponding artificial styles. The measured original intonation and duration style patterns control a diphone synthesizer (mapped contours). Additionally, the patterns are used to train a neural network (NN) model. In two separate listening tests, stimuli were evaluated that were presented either as the original signal or as synthetic speech with mapped or NN-generated prosodic contours. The results show that both original utterances and artificial styles are basically perceived in their intended reading styles. Some reciprocal confusions indicate similarities between different styles, such as News and Fast, Tale and Slow, and Tale and Expressive. The confusions are more likely for synthetic speech. To produce a complex style such as Tale, different features of the prosodic variations Slow and Expressive are combined. The training method for the synthetic styles requires further improvement.

Similar resources

A Grammar Based Approach to Style Specific Phrase Prediction

We present an approach to style specific phrasing for Text-to-Speech (TTS) systems. We formulate the problem of phrase break prediction (or phrasing) as generation of a sequence of breaks (B) and non-breaks (NB) after each word in a sentence. We use prosodic breaks in speech data to build shallow parses over corresponding text. We then learn a grammar that can predict these shallow prosodic pars...

Towards the adaptation of prosodic models for expressive text-to-speech synthesis

This paper presents a preliminary study whose main aim is to characterize four distinct speaking styles according to a limited set of prosodic features, including the length of prosodic phrases (AP and IP), the distribution of stressed syllables, pitch register span, the duration of silent pauses, etc. The analysis was performed using semi-automatic procedures on a corpus consisting of 30 minut...

Prosodic analysis of storytelling discourse modes and narrative situations oriented to text-to-speech synthesis

The generation of synthetic speech with a certain degree of expressiveness has been successful for some particular applications or speaking styles (e.g. emotions). In this context, there is a particular speaking style with subtle speech nuances that may be of great interest for delivering expressive speech: the storytelling style. The purpose of this paper is to define a first step towards deve...

Uncovering Latent Style Factors for Expressive Speech Synthesis

Prosodic modeling is a core problem in speech synthesis. The key challenge is producing desirable prosody from textual input containing only phonetic information. In this preliminary study, we introduce the concept of “style tokens” in Tacotron, a recently proposed end-to-end neural speech synthesis model. Using style tokens, we aim to extract independent prosodic styles from training data. We ...

Individual and contextual variations of prosodic parameters

This is a summary of variabilities and co-variation of prosodic parameters found in our studies of text reading and in the development of text-to-speech synthesis. In addition to F0, duration and intensity, the survey includes aspects of voice production and perception. The role of sub-glottal pressure is discussed. Speech parameters have been correlated with our continuously graded prominence ...

Publication date: 2005
